13 research outputs found
Novel Datasets, User Interfaces and Learner Models to Improve Learner Engagement Prediction on Educational Videos
With the emergence of Open Education Resources (OERs), educational content creation has rapidly scaled up, making a large collection of new materials available. Among these, we find educational videos, the most popular modality for transferring knowledge in the technology-enhanced learning paradigm. Rapid creation of learning resources opens up opportunities in facilitating sustainable education, as the potential to personalise and recommend specific materials that align with individual users' interests, goals, knowledge level, language and stylistic preferences increases. However, the quality and topical coverage of these materials can vary significantly, posing major challenges in managing this large collection, including the risk of negative user experience and engagement with these materials. The scarcity of support resources such as public datasets is another challenge that slows down the development of tools in this research area. This thesis develops a set of novel tools that improve the recommendation of educational videos. Two novel datasets and an e-learning platform with a novel user interface are developed to support the offline and online testing of recommendation models for educational videos. Furthermore, a set of learner models that account for learner interests, knowledge, and the novelty and popularity of content is developed in this thesis. The different models are integrated to propose a novel learner model that accounts for these factors simultaneously. The user studies conducted on the novel user interface show that the new interface encourages users to explore the topical content more rigorously before making relevance judgements about educational videos. Offline experiments on the newly constructed datasets show that the newly proposed learner models significantly outperform their relevant baselines.
VLEngagement: A Dataset of Scientific Video Lectures for Evaluating Population-based Engagement
With the emergence of e-learning and personalised education, the production
and distribution of digital educational resources have boomed. Video lectures
have now become one of the primary modalities to impart knowledge to the masses in
the current digital age. The rapid creation of video lecture content challenges
the currently established human-centred moderation and quality assurance
pipeline, demanding more efficient, scalable and automatic solutions for
managing learning resources. Although a few datasets related to engagement with
educational videos exist, there is still an important need for data and
research aimed at understanding learner engagement with scientific video
lectures. This paper introduces VLEngagement, a novel dataset that consists of
content-based and video-specific features extracted from publicly available
scientific video lectures and several metrics related to user engagement. We
introduce several novel tasks related to predicting and understanding
context-agnostic engagement in video lectures, providing preliminary baselines.
This is the largest and most diverse publicly available dataset to our
knowledge that deals with such tasks. The extraction of Wikipedia topic-based
features also allows associating more sophisticated Wikipedia-based features with
the dataset to improve performance in these tasks. The dataset, helper
tools and example code snippets are available publicly at
https://github.com/sahanbull/context-agnostic-engagemen
Power to the Learner: Towards Human-Intuitive and Integrative Recommendations with Open Educational Resources
Educational recommenders have received much less attention in comparison with e-commerce- and entertainment-related recommenders, even though efficient intelligent tutors could have the potential to improve learning gains and enable advances in education that are essential to achieving the world's sustainability agenda. Through this work, we make foundational advances towards building a state-aware, integrative educational recommender. The proposed recommender accounts for the learners' interests and knowledge at the same time as content novelty and popularity, with the end goal of improving predictions of learner engagement in a lifelong-learning educational video platform. Towards achieving this goal, we (i) formulate and evaluate multiple probabilistic graphical models to capture learner interest; (ii) identify and experiment with multiple probabilistic and ensemble approaches to combine interest, novelty, and knowledge representations together; and (iii) identify and experiment with different hybrid recommender approaches to fuse population-based engagement prediction to address the cold-start problem, i.e., the scarcity of data in the early stages of a user session, a common challenge in recommendation systems. Our experiments with an in-the-wild interaction dataset of more than 20,000 learners show clear performance advantages by integrating content popularity, learner interest, novelty, and knowledge aspects in an informational recommender system, while preserving scalability. Our recommendation system integrates a human-intuitive representation at its core, and we argue that this transparency will prove important in efforts to give agency to the learner in interacting, collaborating, and governing their own educational algorithms.
Towards an Integrative Educational Recommender for Lifelong Learners
One of the most ambitious use cases of computer-assisted learning is to build
a recommendation system for lifelong learning. Most recommender algorithms
exploit similarities between content and users, overlooking the necessity to
leverage sensible learning trajectories for the learner. Lifelong learning thus
presents unique challenges, requiring scalable and transparent models that can
account for learner knowledge and content novelty simultaneously, while also
retaining accurate learner representations for long periods of time. We
attempt to build a novel educational recommender that relies on an integrative
approach combining multiple drivers of learner engagement. Our first step
towards this goal is TrueLearn, which models content novelty and background
knowledge of learners and achieves promising performance while retaining a
human-interpretable learner model.
Comment: In Proceedings of AAAI Conference on Artificial Intelligence 202
Predicting Engagement in Video Lectures
The explosion of Open Educational Resources (OERs) in recent years
creates the demand for scalable, automatic approaches to process and evaluate
OERs, with the end goal of identifying and recommending the most suitable
educational materials for learners. We focus on building models to find the
characteristics and features involved in context-agnostic engagement (i.e.
population-based), a seldom researched topic compared to other contextualised
and personalised approaches that focus more on individual learner engagement.
Learner engagement is arguably a more reliable measure than popularity/number
of views, is more abundant than user ratings and has also been shown to be a
crucial component in achieving learning outcomes. In this work, we explore the
idea of building a predictive model for population-based engagement in
education. We introduce a novel, large dataset of video lectures for predicting
context-agnostic engagement and propose both cross-modal and modality-specific
feature sets to achieve this task. We further test different strategies for
quantifying learner engagement signals. We demonstrate the use of our approach
in the case of data scarcity. Additionally, we perform a sensitivity analysis
of the best performing model, which shows promising performance and can be
easily integrated into an educational recommender system for OERs.
Comment: In Proceedings of International Conference on Educational Data Mining
202
PEEK: A Large Dataset of Learner Engagement with Educational Videos
Educational recommenders have received much less attention in comparison to
e-commerce and entertainment-related recommenders, even though efficient
intelligent tutors have great potential to improve learning gains. One of the
main challenges in advancing this research direction is the scarcity of large,
publicly available datasets. In this work, we release a large, novel dataset of
learners engaging with educational videos in-the-wild. The dataset, named
Personalised Educational Engagement with Knowledge Topics (PEEK), is the first
publicly available dataset of this nature. The video lectures have been
associated with Wikipedia concepts related to the material of the lecture, thus
providing a humanly intuitive taxonomy. We believe that granular learner
engagement signals in unison with rich content representations will pave the
way to building powerful personalization algorithms that will revolutionise
educational and informational recommendation systems. Towards this goal, we 1)
construct a novel dataset from a popular video lecture repository, 2) identify
a set of benchmark algorithms to model engagement, and 3) run extensive
experimentation on the PEEK dataset to demonstrate its value. Our experiments
with the dataset show promise in building powerful informational recommender
systems. The dataset and the support code are available publicly.
Can Population-based Engagement Improve Personalisation? A Novel Dataset and Experiments
This work explores how population-based engagement prediction can address cold-start at scale in large learning resource collections. This paper introduces i) VLE, a novel dataset that consists of content- and video-based features extracted from publicly available scientific video lectures coupled with implicit and explicit signals related to learner engagement, ii) two standard tasks related to predicting and ranking context-agnostic engagement in video lectures with preliminary baselines and iii) a set of experiments that validate the usefulness of the proposed dataset. Our experimental results indicate that the newly proposed VLE dataset leads to building context-agnostic engagement prediction models that are significantly more performant than ones based on previous datasets, mainly attributable to the increase in training examples. The VLE dataset's suitability for building models for Computer Science/Artificial Intelligence education focused on e-learning/MOOC use-cases is also evidenced. Further experiments in combining the built model with a personalising algorithm show promising improvements in addressing the cold-start problem encountered in educational recommenders. This is the largest and most diverse publicly available dataset to our knowledge that deals with learner engagement prediction tasks. The dataset, helper tools, descriptive statistics and example code snippets are available publicly.
Watch Less and Uncover More: Could Navigation Tools Help Users Search and Explore Videos?
Prior research has shown how "content preview tools" improve the speed and accuracy of user relevance judgements across different information retrieval tasks. This paper describes a novel user interface tool, the Content Flow Bar, designed to allow users to quickly identify relevant fragments within informational videos to facilitate browsing, through a cognitively augmented form of navigation. It achieves this by providing semantic "snippets" that enable the user to rapidly scan through video content. The tool provides visually appealing pop-ups that appear in a time series bar at the bottom of each video, allowing users to see in advance and at a glance how topics evolve in the content. We conducted a user study to evaluate how the tool changes the users' search experience in video retrieval, as well as how it supports exploration and information seeking. The user questionnaire revealed that participants found the Content Flow Bar helpful and enjoyable for finding relevant information in videos. The interaction logs of the user study, where participants interacted with the tool to complete two informational tasks, showed that it holds promise for enhancing discoverability of content both across and within videos. This potential could be leveraged in a new generation of navigation tools in search and information retrieval.
Scalable Educational Question Generation with Pre-trained Language Models
The automatic generation of educational questions will play a key role in
scaling online education, enabling self-assessment at scale when a global
population is manoeuvring their personalised learning journeys. We develop
EduQG, a novel educational question generation model built by adapting
a large language model. Our extensive experiments demonstrate that
EduQG can produce superior educational questions by further
pre-training and fine-tuning a pre-trained language model on the scientific
text and science question data.
Comment: To be published at the Int. Conf. on Artificial Intelligence in
Education (Tokyo, 2023)
Pre-training with Scientific Text Improves Educational Question Generation (Student Abstract)
With the boom of digital educational materials and scalable e-learning systems, the potential for realising AI-assisted personalised learning has skyrocketed. In this landscape, the automatic generation of educational questions will play a key role, enabling scalable self-assessment when a global population is manoeuvring their personalised learning journeys. We develop EduQG, a novel educational question generation model built by adapting a large language model. Our initial experiments demonstrate that EduQG can produce superior educational questions by pre-training on scientific text